Get A GeoJSON File

Geospatial data can be expressed in maps, which can offer amazing levels of data density.

You’re probably familiar with JSON, which is frequently used to store and pass data between applications. GeoJSON applies JSON structure to geospatial data in a single JSON file. Let’s get a map of Philly Census Tracts from the City of Philadelphia:

library(rgdal)
philadelphia_map <- readOGR('http://data.phl.opendata.arcgis.com/datasets/8bc0786524a4486bb3cf0f9862ad0fbf_0.geojson')
## OGR data source with driver: GeoJSON 
## Source: "http://data.phl.opendata.arcgis.com/datasets/8bc0786524a4486bb3cf0f9862ad0fbf_0.geojson", layer: "OGRGeoJSON"
## with 384 features
## It has 14 fields

We can explore this object using the Environment pane in RStudio, or we can use commands like str (structure) or head (show first few rows) to look into this object.

str(philadelphia_map, max.level = 2)
## Formal class 'SpatialPolygonsDataFrame' [package "sp"] with 5 slots
##   ..@ data       :'data.frame':  384 obs. of  14 variables:
##   ..@ polygons   :List of 384
##   ..@ plotOrder  : int [1:384] 299 344 46 368 304 253 383 127 188 134 ...
##   ..@ bbox       : num [1:2, 1:2] -75.3 39.9 -75 40.1
##   .. ..- attr(*, "dimnames")=List of 2
##   ..@ proj4string:Formal class 'CRS' [package "sp"] with 1 slot
head(philadelphia_map@data)
OBJECTID STATEFP10 COUNTYFP10 TRACTCE10 GEOID10 NAME10 NAMELSAD10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10 LOGRECNO
0 1 42 101 009400 42101009400 94 Census Tract 94 G5020 S 366717 0 +39.9632709 -075.2322437 10429
1 2 42 101 009500 42101009500 95 Census Tract 95 G5020 S 319070 0 +39.9658709 -075.2379140 10430
2 3 42 101 009600 42101009600 96 Census Tract 96 G5020 S 405273 0 +39.9655396 -075.2435075 10431
3 4 42 101 013800 42101013800 138 Census Tract 138 G5020 S 341256 0 +39.9764504 -075.1771771 10468
4 5 42 101 013900 42101013900 139 Census Tract 139 G5020 S 562934 0 +39.9750563 -075.1711846 10469
5 6 42 101 014000 42101014000 140 Census Tract 140 G5020 S 439802 0 +39.9735358 -075.1630966 10470

The important takeaway:

  • rgdal will ingest geoJSON or Shapefile data and give you back a “Spatial Polygons Data Frame”
  • This object type can be thought of as having two main sections: a data frame with tabular data (@data), and four elements which together make up the map (@polygons, @plotOrder, @bbox, @proj4string).

Make the simplest map possible

We’re going to use leaflet to create a dynamic map that users can interact with. It’s not going to be the prettiest!

library(leaflet)

leaflet(philadelphia_map) %>%
  addPolygons() 

We can change default colors and add labels that come from the @data part of our map object.

leaflet(philadelphia_map) %>%
    addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = "white",
    fillOpacity = 1,
    label = philadelphia_map@data$NAMELSAD10
  )

At this point, let’s return to the slide deck!

Add Our Data!

Currently all we have is a boring map of the Census Tracts in Philadelphia. Let’s add some data to enrich this map with our own patient or research information.

For the purposes of this exercise, we’re going to use public data from the City of Philadelphia. See the download page for more details.

library(rgdal)
child_blood_lead <- read.csv("https://phl.carto.com/api/v2/sql?q=SELECT+*+FROM+child_blood_lead_levels_by_ct&filename=child_blood_lead_levels_by_ct&format=csv&skipfields=cartodb_id,the_geom,the_geom_webmercator", stringsAsFactors = FALSE)

Let’s peek at this data:

head(child_blood_lead)
census_tract data_redacted num_bll_5plus num_screen perc_5plus
42101000100 false 0 100 0
42101000200 true NA 109 NA
42101000300 true NA 110 NA
42101000401 true NA 61 NA
42101000402 false 0 41 0
42101000500 true NA 49 NA

Understanding this data:

  • The Census Tract listed here includes the state (42), county (101), and tract number all in one geographic ID. This is how you should ask for your tract data!
  • Some data is suppressed for k-anonymity purposes. If 1-5 children test at high blood lead levels, the precise number is suppressed.

Merging Tabular Data

WARNING

merge will reorder your rows, and that breaks the relationship between the rows of tabular data and the corresponding polygons. Not what you want! So we’re going to keep track of the order thanks to the helpful “OBJECTID” variable, which is super helpful. If you’re working with data that doesn’t have a column like this, just add it, so you can keep track of what the original order was.

Merging in this case is pretty simple – we just have to bring in the lead data and make sure our “hinge” (overlapping field) is set up properly:

merged <-  merge(x = philadelphia_map@data,
                 y = child_blood_lead,
                 by.x = "GEOID10",
                 by.y = "census_tract",
                 all = TRUE)

philadelphia_map@data <- merged[order(merged$OBJECTID),]

Plot Color-Coded Data (Choropleth)

Let’s see what our lead levels look like:

library(leaflet.extras)

lead_palette <- colorBin("Blues", domain = philadelphia_map@data$perc_5plus, bins = 10, na.color = "#aaaaaa")

leaflet(philadelphia_map) %>%
  addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
    fillOpacity = 1,
    label = philadelphia_map@data$NAMELSAD10
  ) %>%
  suspendScroll()

At this point, let’s return to the slide deck!

Enrich!

And here we want to pull in our local file, which is a simplified version of data from the American Community Survey conducted by the Census Bureau. Let’s see what is contains.

economic_data <- read.csv("../Data/philly_census.csv")
head(economic_data)
census_tract pct_unemployed pct_commute_public_transit median_household_income mean_household_income pct_child_uninsured pct_families_below_poverty_line
42101000100 2.6 33.4 103772 124525 33.9 1.7
42101000200 7.6 22.0 50455 99337 6.5 13.7
42101000300 4.6 11.5 93036 121867 3.3 3.4
42101000401 5.9 21.8 57604 83354 0.0 23.2
42101000402 2.0 14.6 70038 92625 0.0 0.0
42101000500 12.3 20.6 40568 60813 9.1 0.0

This is selected economic characteristics of various census tracts. Let’s combine the data here with our map, and use labels to allow people to understand the data better:

merged_again <- merge(x=philadelphia_map@data,
                      y=economic_data,
                      by.x = "GEOID10",
                      by.y = "census_tract",
                      all = TRUE)

philadelphia_map@data <- merged_again[order(merged_again$OBJECTID),]


head(philadelphia_map@data)
GEOID10 OBJECTID STATEFP10 COUNTYFP10 TRACTCE10 NAME10 NAMELSAD10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10 LOGRECNO data_redacted num_bll_5plus num_screen perc_5plus pct_unemployed pct_commute_public_transit median_household_income mean_household_income pct_child_uninsured pct_families_below_poverty_line
95 42101009400 1 42 101 009400 94 Census Tract 94 G5020 S 366717 0 +39.9632709 -075.2322437 10429 false 14 306 4.6 19.2 49.1 18408 31068 4.6 41.5
96 42101009500 2 42 101 009500 95 Census Tract 95 G5020 S 319070 0 +39.9658709 -075.2379140 10430 false 11 253 4.3 12.3 53.2 27708 33932 20.0 32.9
97 42101009600 3 42 101 009600 96 Census Tract 96 G5020 S 405273 0 +39.9655396 -075.2435075 10431 false 14 314 4.5 15.2 39.4 24402 33784 1.2 39.1
134 42101013800 4 42 101 013800 138 Census Tract 138 G5020 S 341256 0 +39.9764504 -075.1771771 10468 false 17 157 10.8 15.8 19.6 28534 41352 4.8 24.2
135 42101013900 5 42 101 013900 139 Census Tract 139 G5020 S 562934 0 +39.9750563 -075.1711846 10469 false 14 264 5.3 13.4 37.8 14314 30200 1.4 47.7
136 42101014000 6 42 101 014000 140 Census Tract 140 G5020 S 439802 0 +39.9735358 -075.1630966 10470 false 7 140 5.0 11.7 49.0 24474 36123 3.7 33.4

Ideally we’d love to be able to have a dual-purpose choropleth that shows both a color-coded blood lead level layer and a color-coded poverty level layer. Let’s do that, using the keyword “group” and some layer controls.

Let’s first create some useful labels that will show when polygons are activated by hover:

labels <- sprintf(
  "<strong>%s</strong><br/>
  Families Below Poverty Line (%%): %g <br/>
  Children With High Blood Lead Levels (%%): %g",
  philadelphia_map@data$NAMELSAD10, 
  philadelphia_map@data$pct_families_below_poverty_line,
  philadelphia_map@data$perc_5plus
) %>% lapply(htmltools::HTML)

And now let’s map both poverty and lead data, using two addPolygon layers and some layer selection:

poverty_palette <- colorBin("Reds", domain = philadelphia_map@data$pct_families_below_poverty_line, bins = 10, na.color = "#cccccc")

leaflet(philadelphia_map) %>%
  addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
    fillOpacity = 0.5,
    label = labels,
    labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"),
    group = "Lead Level"
  ) %>%
    addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~poverty_palette(philadelphia_map@data$pct_families_below_poverty_line),
    fillOpacity = 1,
    label = labels,
    labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"),
    group = "Poverty Level"
  ) %>%
  addLayersControl(
    baseGroups = c("Lead Level", "Poverty Level"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>%
  suspendScroll()

We could also add markers to indicate the most vulnerable areas – where child high blood lead was measured at over 15% and the poverty rate is over 25%:

library(dplyr)

most_vulnerable <- philadelphia_map@data %>% 
  mutate(lat = as.numeric(as.character(INTPTLAT10)), lng = as.numeric(as.character(INTPTLON10))) %>%
  filter(pct_families_below_poverty_line > 25, perc_5plus > 15)

leaflet(philadelphia_map) %>%
  addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
    fillOpacity = 0.5,
    label = labels,
    labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"),
    group = "Lead Level"
  ) %>%
    addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~poverty_palette(philadelphia_map@data$pct_families_below_poverty_line),
    fillOpacity = 1,
    label = labels,
    labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"),
    group = "Poverty Level"
  ) %>%
  addMarkers(lat=most_vulnerable$lat, lng=most_vulnerable$lng) %>%
  addLayersControl(
    baseGroups = c("Lead Level", "Poverty Level"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>%
  suspendScroll()

Want to save this and pass this javascript-powered map to your web developer to put on your website?

library(htmlwidgets)
my_map <- leaflet(philadelphia_map) %>%
  addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~lead_palette(philadelphia_map@data$perc_5plus),
    fillOpacity = 0.5,
    label = labels,
    labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"),
    group = "Lead Level"
  ) %>%
    addPolygons(
    weight = 1,  # border thickness
    opacity = 0.5, # border opacity
    color = "grey", # border color
    fillColor = ~poverty_palette(philadelphia_map@data$pct_families_below_poverty_line),
    fillOpacity = 1,
    label = labels,
    labelOptions = labelOptions(
    style = list("font-weight" = "normal", padding = "3px 8px"),
    textsize = "15px",
    direction = "auto"),
    group = "Poverty Level"
  ) %>%
  addMarkers(lat=most_vulnerable$lat, lng=most_vulnerable$lng) %>%
  addLayersControl(
    baseGroups = c("Lead Level", "Poverty Level"),
    options = layersControlOptions(collapsed = FALSE)
  ) %>%
  suspendScroll()
saveWidget(my_map, file="../Media/my_map.html")